Data Science

What's that?


In [1]:
from IPython.display import Image
Image(url='http://static.squarespace.com/static/5150aec6e4b0e340ec52710a/t/51525c33e4b0b3e0d10f77ab/1364352052403/Data_Science_VD.png?format=750w')


Out[1]:

Read the full story:

http://lucafoschini.com/notebooks/Agile%20Data%20Science%20Meetup.slides.html#/

Source (another feature of notebooks: you can make slides out of them!):

https://github.com/LucaFoschini/lucafoschini.github.io/blob/master/notebooks/Agile%20Data%20Science%20Meetup.ipynb

Getting started

This section focuses on the 'Hacking Skills' part of the Venn diagram: we're going to get you up and running on Linux, using git and GitHub, and starting to work with Jupyter notebooks. Everything here is used in later sections of this bootcamp, so take notes!

First up: opening the terminal. Either search for a program named 'terminal' on your computer (more on that in the links below), or use 'Applications (top left corner of your screen) -> System Tools -> Terminal'.

The terminal is a powerful and efficient replacement for the file explorer:


In [2]:
Image(url='https://upload.wikimedia.org/wikipedia/en/c/cb/Windows_Explorer_Windows_7.png')


Out[2]:

"In Unix, a word is worth a thousand clicks." Much like in a traditional file explorer, you are always located in a specific directory (e.g. 'Libraries' in the picture above), and you can travel up and down through the directory structure, running or opening files with various programs. If you are new to the command line, or even just a little bit rusty, the short introduction and tutorial below will bring you up to speed.
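To make the idea of being "located" in a directory concrete, here is a minimal sketch that can be run in a notebook cell. It uses Python's standard pathlib module rather than the shell itself, and the comments name the usual terminal commands that do the same job. What it prints depends on where the notebook was started.

```python
from pathlib import Path

# Where am I? (terminal equivalent: pwd)
here = Path.cwd()
print("current directory:", here)

# One level up in the directory structure (terminal equivalent: cd ..)
print("parent directory:", here.parent)

# What is in this directory? (terminal equivalent: ls)
# Directories get a trailing slash so they are easy to tell apart from files.
entries = sorted(p.name + ("/" if p.is_dir() else "") for p in here.iterdir())
print(entries[:10])
```

In the terminal each of these steps is a single short command, which is the point of the quote above.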

Data Science from the command line

The Unix terminal can do far more than browse files, so comparing it to a file explorer undersells it. Get a feel for how much you can do with so little by working through the following tutorial, up to the 'awk' section:

http://www.ibm.com/developerworks/aix/library/au-unixtext/
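As a rough illustration of the kind of task these text utilities handle, the sketch below counts word frequencies in a few lines of text. It is written in Python so it can run in a notebook cell; the sample text is made up for this example, and the comments name the shell tools that would typically do each step in a pipeline.

```python
from collections import Counter

# Hypothetical sample text; in practice this would be read from a file.
text = """the quick brown fox
jumps over the lazy dog
the dog barks"""

# Break the text into words (in the shell: tr to put one word per line).
words = text.split()

# Count how often each word occurs (in the shell: sort | uniq -c).
counts = Counter(words)

# Keep the two most frequent words (in the shell: sort -rn | head -n 2).
top = counts.most_common(2)
print(top)  # [('the', 3), ('dog', 2)]
```

In the terminal the same result comes from chaining a handful of small commands with pipes, with no script file needed.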

Unix textutils can go a long way. Here are seven examples of powerful little tools that can be run from the terminal:

http://jeroenjanssens.com/2013/09/19/seven-command-line-tools-for-data-science.html


In [ ]: